# Multimodal Vision Model
Owlv2 Base Patch16 Ensemble
Apache-2.0
OWLv2 is a zero-shot text-conditioned object detection model that can locate objects in images through text queries.
Object Detection
Transformers

upfeatmediainc
15
0
Owlv2 Large Patch14 Ensemble
Apache-2.0
OWLv2 is a zero-shot text-conditioned object detection model that can locate objects in images through text queries.
Object Detection
Transformers

google
262.77k
25
Owlv2 Large Patch14
Apache-2.0
OWLv2 is a zero-shot text-conditioned object detection model that can detect objects in images through text queries without requiring category-specific training data.
Object Detection
Transformers

google
3,679
5
Owlvit Large Patch14
Apache-2.0
OWL-ViT is a zero-shot text-conditioned object detection model that can retrieve objects in images through text queries.
Object Detection
Transformers

google
25.01k
25
Owlvit Base Patch16
Apache-2.0
OWL-ViT is a zero-shot text-conditioned object detection model that can detect objects in images via text queries.
Object Detection
Transformers

google
4,588
12